Abstract:

We consider parameter estimation in a regression model corresponding to an 
iid sequence of censored observations of a finite state modulated renewal process. 
The model assumes a similar form as in Cox regression except that the baseline 
intensities are functions of the backwards recurrence time of the process and a time 
dependent covariate. As a result of this it falls outside the class of multiplicative 
intensity counting process models. We use kernel estimation to construct estimates 
of the regression coefficients and baseline cumulative hazards. We give conditions 
for consistency and asymptotic normality of estimates. Data from a bone marrow 
transplant study are used to illustrate the results. 
1. Introduction

In medical and engineering applications it is common to consider a Markov
renewal process to model the lengths of time spent in consecutive stages of a
disease or lifetime of a piece of equipment. Denoting by $\mathcal{J} = \{1, \ldots, k\}$ the set
of possible states, the process is described by a sequence of random variables
$(T, J) = (T_m, J_m)_{m \ge 0}$, such that $T_0 < T_1 < T_2 < \ldots$ are consecutive times of
entrances into states $J_0, J_1, \ldots, J_m \in \mathcal{J}$. Under the assumption of a Markov renewal
process, the sequence $J = \{J_m : m \ge 0\}$ of states visited forms a Markov chain
and, given $J$, the sojourn times $T_1, T_2 - T_1, \ldots$ are independent with distributions
depending on the adjoining states. Associated with the sequence $(T, J)$ is a
counting process $\{N_{ij}(t) : t \ge 0, i,j \in \mathcal{J}\}$ whose components register each direct
$i \to j$ transition,

$$N_{ij}(t) = \sum_{m \ge 0} 1(T_{m+1} \le t, J_m = i, J_{m+1} = j) .$$

Its compensator $\{\Lambda_{ij}(t) : t \ge 0, i,j \in \mathcal{J}\}$ relative to the self-exciting filtration is
given by



where $J(t), t \ge 0$, is the state occupied at time $t$, $L(t) = t - T_{N(t-)}$ is the
backwards recurrence time, $N(t) = \sum_{i,j \in \mathcal{J}} N_{ij}(t)$, and $[A_{ij}(x)]_{i,j \in \mathcal{J}}$ is a matrix of
unknown deterministic functions representing cumulative hazards of one-step
transitions. Nonparametric estimation of this matrix and the associated semi-Markov
kernel of the process was considered by Lagakos, Sommer and Zelen (1978), Gill
(1980), Voelkel and Crowley (1984), and Phelan (1990), among others.

In this paper we consider estimation in a modulated renewal process, assuming
that components of the counting process $\{N_{ij} : i,j \in \mathcal{J}\}$ have intensities of
the form

(1.1)

where $X(s)$ is a time dependent covariate, $Z = \{Z_{ij}(t) : t \ge 0, i,j \in \mathcal{J}\}$ is
a vector of external transition specific covariates, and $[\alpha_{ij}]$ is a matrix of
two-parameter baseline hazards. A model of this kind may arise for instance in
medical applications where survival status of a patient is characterized by an
illness process with baseline intensities dependent on the length of time spent in
each stage of a disease and a covariate $X(s)$, possibly changing with time. In the
absence of this covariate, the model reduces to the modulated renewal process
proposed by Cox (1973) with cumulative intensities

(1.2)

Both models have several interesting features. The first one is that the event
times can be viewed as recorded on two simultaneously evolving time scales. In
the case of (1.2), the covariates depend on the calendar time $t$, whereas the matrix
$\alpha$ of baseline hazards depends on the duration scale. In the case of (1.1), the
latter matrix depends both on the duration and calendar time scale. Further, if
$\alpha$ corresponds to a matrix of functions depending only on a Euclidean parameter
$\theta$, then estimation of the pair $(\beta, \theta)$ based on an iid sample of modulated renewal
processes can be carried out using a counting process framework for analysis
of maximum likelihood or M estimates. However, if the matrix $\alpha$ is completely
unspecified, then its nonparametric maximum likelihood estimate falls outside the
class of statistics taking the form of stochastic integrals with respect to counting
processes (Gill (1980)). Similarly, in the case of (1.2), estimation of the regression
coefficient $\beta$ can be in principle based on the solution to the score equation

(1.3)

where $S^{(p)}_{ij}(t,\beta) = \sum_{\ell=1}^{n} 1(J_\ell(t-) = i) Z_{ij\ell}(t)^{p} e^{\beta^T Z_{ij\ell}(t)}$, $p = 0, 1$. However, as a
result of the dependence of the compensators on the backwards recurrence time, the
score function in (1.3), evaluated at the true parameter value $\beta_0$, fails to satisfy
the identity $\mathrm{E}\,\Phi_n(\beta_0) = o_P(1)$, and consequently the estimate of the regression
coefficient obtained by solving the equation $\Phi_n(\beta) = 0$ cannot be consistent.
Several authors also considered the special case of the one-jump process (1.1)
and showed that estimation of regression coefficients requires smoothing (Sasieni
(1992), Dabrowska (1997), Nielsen, Linton and Bickel (1998), Pons and Visser
(2000)).

To circumvent difficulties arising in the analyses of renewal processes, Gill
(1980) and Oakes and Cui (1994) proposed the use of a random time-change
approach which replaces the calendar time scale $t$ by the duration scale. Here
we consider an extension of this approach to analyse a simple case of (1.1),
assuming that the covariate $X(s)$ is constant between the jumps of the process
$N(t) = \sum_{i,j} N_{ij}(t)$, and $\{Z_{ij}(t) : i,j \in \mathcal{J}, t \ge 0\}$ is a vector of external covariates.
In Section 3 we discuss kernel estimation in single-type models. In Section 4
we give examples of multi-type models with a "small" state space to which the
results can also be applied. We use data from a bone marrow transplant study
to illustrate the results.
2. The model 

Throughout the paper we assume that $(\Omega, \mathcal{F}, P)$ is a complete probability
space and $(T_m, V_m)_{m \ge 0}$ is a marked point process defined on it with marks taking
on values in a measurable space $(E, \mathcal{E})$ and enlarged by the empty mark $\Delta$. Thus
$T_0 < T_1 < \ldots < T_m < \ldots$ is a sequence of random time points registering occurrence
of some events in time, such that the $T_m$ are almost surely distinct and $T_m \uparrow \infty$
$P$-a.s. At time $T_m$ we observe a variable $V_m$ such that $V_m \in E$ if $T_m < \infty$, and
$V_m = \Delta$ if $T_m = \infty$.

For any $B \in \mathcal{E}$, let $N(t, B) = \sum_{m \ge 0} 1(T_{m+1} \le t, V_{m+1} \in B)$ be the process
counting observations falling into the set $[0,t] \times B$. The internal history of the
process, $\{\mathcal{F}^N_t\}_{t \ge 0}$, represents information collected on $N$ until time $t$, and is given
by

$$\mathcal{F}^N_t = \sigma(1(T_m \le s, V_m \in B) : m \ge 0, s \le t, B \in \mathcal{E}) .$$

Then $\{\mathcal{F}^N_t\}_{t \ge 0}$ forms an increasing family of right-continuous $\sigma$-fields. Let $\mathcal{F}_t =
\mathcal{F}_0 \vee \mathcal{F}^N_t$ be the self-exciting filtration associated with the process $N$, obtained by
adjoining to the internal history of the process the $P$-null sets. The compensator
of the process $N(t, B)$ with respect to $\mathcal{F}_t$ is given by

$$\Lambda(t,B) = \Lambda(T_m,B) + \int_{(T_m,t]} \frac{P_m(ds, B)}{P_m([s,\infty], E \cup \Delta)} \quad \text{for } t \in (T_m, T_{m+1}] ,$$

where $P_m(d(s, v))$ is a version of a regular conditional distribution of $(T_{m+1}, V_{m+1})$
given $\mathcal{F}_{T_m}$ (Jacod (1975)).

In this paper we assume that the marks $V_m$ have the form $V_m = (J_m, X_m, Z_m)$,
where $J_m \in \mathcal{J}$ is the state visited at time $T_m$ and $(Z_m, X_m)$ are covariates taking
on values in $E_1 = R^d \times [0,\tau]$, $\tau < \infty$. The pair $(Z_m, X_m)$ may represent some
measurements taken upon entrance into the state $J_m$. For any Borel set $B$ of $E_1$,
let $\mu_{m+1}(B,t,j) = \Pr((Z_{m+1}, X_{m+1}) \in B \mid T_{m+1} = t, J_{m+1} = j, (T_\ell, J_\ell, Z_\ell, X_\ell)_{\ell=0}^{m})$
and suppose that

$$\Pr(T_{m+1} - T_m \le s, J_{m+1} = j \mid (T_\ell, J_\ell, Z_\ell, X_\ell)_{\ell=0}^{m}) =
1(J_m = i) \int_{[0,s]} \exp\Big[-\sum_{\ell \in \mathcal{J}} \int_0^u e^{\beta^T Z_{i\ell m}(v)} \alpha_{i\ell}(v, X_m) dv\Big] e^{\beta^T Z_{ijm}(u)} \alpha_{ij}(u, X_m) du ,$$

where $Z_{ijm}(u) = f_m(u, (T_\ell, J_\ell, Z_\ell, X_\ell) : \ell = 0, \ldots, m)$ is a fixed deterministic
function $f_m$, left continuous in $u$. The process $N_{ij}(t, B) = \sum_{m \ge 0} 1(T_{m+1} \le t, J_{m+1} =
j, J_m = i, (Z_{m+1}, X_{m+1}) \in B)$ has compensator given by

$$\Lambda_{ij}(t,B) = \Lambda_{ij}(T_m,B) + \int_{(T_m,t]} \mu_{m+1}(B,u,j) 1(J_m = i) e^{\beta^T Z_{ijm}(u - T_m)} \alpha_{ij}(u - T_m, X_m) du .$$

In particular, setting $B = E_1$ and using $\mu_{m+1}(E_1, T_{m+1}, j) 1(T_{m+1} < \infty) = 1$,

$$\Lambda_{ij}(t) = \Lambda_{ij}(t, E_1) = \Lambda_{ij}(T_m) + \int_{(T_m,t]} 1(J_m = i) e^{\beta^T Z_{ijm}(u - T_m)} \alpha_{ij}(u - T_m, X_m) du$$

is the compensator of the counting process $N_{ij}(t) = N_{ij}(t, E_1) = \sum_{m \ge 0} 1(T_{m+1} \le
t, J_{m+1} = j, J_m = i)$, registering transitions among the adjacent states of the
model.

In the following we assume the random censorship model of Gill (1980).
Thus the times at which the process is observed are determined by a process $C(t) =
\sum_{m \ge 1} 1(C_{m-1} < t \le C_m)$, where $0 = C_0 \le C_1 \le \ldots \le C_m \le \ldots$ is an increasing
sequence such that $C_m \in [T_m, T_{m+1}]$ are stopping times with respect to the history
$\{\mathcal{F}_t\}_{t \ge 0}$ and $(C_m)_{m \ge 0}$ is conditionally independent of $\{(T_m, J_m, Z_m, X_m)\}_{m \ge 0}$
given $(J_0, Z_0, X_0)$. If $T_m = C_m$, then no information is available on either the
sojourn time $T_{m+1} - T_m$, the states $(J_m, J_{m+1})$ or the covariates $(Z_m, X_m)$,
$(Z_{m+1}, X_{m+1})$. If $C_m = T_{m+1}$, then the sojourn time $T_{m+1} - T_m$, the adjoining
states $(J_m, J_{m+1})$ and the covariates $(Z_m, X_m)$, $(Z_{m+1}, X_{m+1})$ are observable.
Finally, if $T_m < C_m < T_{m+1}$, then the state $J_m$ and the covariates
$(Z_m, X_m)$ are visible while the sojourn time $T_{m+1} - T_m$ is only known to exceed
$C_m - T_m$. We also assume that the censoring process is monotone in the sense that
$T_m < C_m < T_{m+1} \Rightarrow C_{m'} = T_{m'}$ for all $m' > m$. This condition stipulates
that the process terminates once censoring takes place. To construct estimates
of the unknown parameters, we use a time transformation which replaces the
chronological (or calendar) time scale by the duration scale (Gill (1980), Oakes
and Cui (1994)). For $m \ge 0$, let

$$N_{ijm}(v) = 1(T_{m+1} - T_m \le v, J_m = i, J_{m+1} = j, T_{m+1} = C_m) ,$$
$$Y_{im}(v) = 1(T_{m+1} - T_m \ge v, C_m - T_m \ge v, J_m = i) ,$$
$$M_{ijm}(v) = N_{ijm}(v) - \int_0^v Y_{im}(u) e^{\beta^T Z_{ijm}(u)} \alpha_{ij}(u, X_m) du .$$
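The duration-scale indicators $N_{ijm}(v)$ and $Y_{im}(v)$ can be illustrated with a minimal Python sketch. The function name, the argument layout and the example path are hypothetical and serve only as an illustration; the sketch assumes that the latent jump time $T_{m+1}$ is available even for a censored sojourn, which is true in a simulation but not in observed data.

```python
def sojourn_processes(T, J, C, m, i, j, v):
    """Evaluate (N_ijm(v), Y_im(v)) for sojourn m at duration v.

    T : jump times T_0 < T_1 < ... (T[m + 1] is needed even if censored)
    J : states entered at the jump times
    C : censoring times with T[m] <= C[m] <= T[m + 1]
    """
    duration = T[m + 1] - T[m]
    jump_observed = C[m] == T[m + 1]      # the sojourn ends with a visible jump
    n_ijm = int(duration <= v and J[m] == i and J[m + 1] == j and jump_observed)
    y_im = int(duration >= v and C[m] - T[m] >= v and J[m] == i)
    return n_ijm, y_im


# Hypothetical path: two observed sojourns, then censoring at calendar time 7.
T = [0.0, 2.0, 5.0, 9.0]
J = [1, 2, 1, 2]
C = [2.0, 5.0, 7.0]
```

In this example the third sojourn starts at calendar time 5 and is censored after 2 time units, so it contributes to the risk set only for durations up to 2 and never to the counting process.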

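Lemma 2.1 Suppose that $\{\varphi_m(v), m \ge 0, v \ge 0\}$ is a sequence of left-continuous
random functions such that the process $\varphi \circ L(t) = \sum_{m \ge 0} \varphi_m(t - T_m) 1(T_m <
t \le T_{m+1})$ is predictable with respect to the filtration $\{\mathcal{F}_t\}_{t \ge 0}$ and $\mathrm{E} \int_0^\infty [\varphi \circ
L]^2(s) \Lambda_{ij}(ds) < \infty$. Then

$$\mathrm{E} \sum_m \int_0^\infty \varphi_m(u) N_{ijm}(du) = \mathrm{E} \sum_m \int_0^\infty Y_{im}(u) \varphi_m(u) e^{\beta^T Z_{ijm}(u)} \alpha_{ij}(u, X_m) du ,$$

$$\mathrm{E} \Big[\sum_m \int_0^\infty \varphi_m(u) M_{ijm}(du)\Big]^2 = \mathrm{E} \sum_m \int_0^\infty Y_{im}(u) \varphi_m^2(u) e^{\beta^T Z_{ijm}(u)} \alpha_{ij}(u, X_m) du .$$

In addition, if $\{\varphi_{1m} : m \ge 0\}$ and $\{\varphi_{2m} : m \ge 0\}$ are two such sequences, then

$$\mathrm{E} \Big[\sum_m \int_0^\infty \varphi_{1m}(u) M_{ijm}(du)\Big] \Big[\sum_m \int_0^\infty \varphi_{2m}(u) M_{k\ell m}(du)\Big] = 0$$

for pairs $(i, j) \ne (k, \ell)$.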

Much in the same way as in Gill (1980), this lemma follows from the Dominated
Convergence Theorem, martingale properties of the processes $M_{ij}$, and

$$\int_0^\infty [\varphi \circ L](s) C(s) N_{ij}(ds) = \sum_{m \ge 0} \int_0^\infty \varphi_m(u) N_{ijm}(du) ,$$

$$\int_0^\infty [\varphi \circ L]^k(s) C(s) \Lambda_{ij}(ds) = \sum_{m \ge 0} \int_0^\infty \varphi_m(u)^k Y_{im}(u) e^{\beta^T Z_{ijm}(u)} \alpha_{ij}(u, X_m) du .$$

The identities hold almost surely for $k = 1, 2$. We omit the details.
3. Estimation in single-type event processes 

In this section we assume that all events are of a single type. To estimate the
baseline cumulative hazard function, we use the conditional Aalen-Nelson estimator
(Beran (1981))

$$\hat A(v; x, \beta) = \frac{1}{na} \sum_{i=1}^{n} \int_0^v \frac{N_i(du, x)}{S^{(0)}_{-i}(u, \beta, x)} ,$$

where $S^{(0)}_{-i}(u, \beta, x) = \frac{1}{(n-1)a} \sum_{j \ne i} S^{(0)}_j(u, \beta, x)$ and for each $i = 1, \ldots, n$,

$$N_i(u, x) = \sum_m K_n(x, X_{im}) N_{im}(u) , \qquad S^{(0)}_i(u, \beta, x) = \sum_m Y_{im}(u) e^{\beta^T Z_{im}(u)} K_n(x, X_{im}) .$$
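A minimal Python sketch of a kernel-weighted Aalen-Nelson estimate with leave-one-out risk processes follows. It is an illustration under simplifying assumptions that are not part of the paper: one sojourn per subject, a time independent linear predictor $\beta^T Z_i$, the central Epanechnikov kernel in place of the boundary-corrected kernel $K_n$, and a leave-one-out normalisation by $(n-1)a$. All function and argument names are hypothetical.

```python
import numpy as np


def epanechnikov(r):
    """Central second-order kernel (3/4)(1 - r^2) on [-1, 1]."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= 1.0, 0.75 * (1.0 - r ** 2), 0.0)


def conditional_aalen_nelson(v, x, durations, observed, covariate, linpred, a):
    """Kernel-weighted Aalen-Nelson estimate at duration v and covariate level x.

    durations : sojourn lengths (one sojourn per subject, an assumption)
    observed  : 1 if the sojourn ended with an observed jump, 0 if censored
    covariate : smoothing covariate values X_i
    linpred   : beta^T Z_i, taken as time independent here
    a         : bandwidth
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    weights = epanechnikov((x - np.asarray(covariate, dtype=float)) / a)
    risk = weights * np.exp(np.asarray(linpred, dtype=float))
    n = len(durations)
    total = 0.0
    for i in range(n):
        if not observed[i] or durations[i] > v or weights[i] == 0.0:
            continue
        at_risk = durations >= durations[i]
        at_risk[i] = False                      # leave subject i out
        s_minus_i = risk[at_risk].sum() / ((n - 1) * a)
        if s_minus_i > 0.0:                     # skip jumps with an empty risk set
            total += weights[i] / s_minus_i
    return total / (n * a)
```

Jumps whose leave-one-out risk set is empty are skipped in this sketch; how such terms should be handled in practice is a design choice that the text does not address.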

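Here $K_n(x,w)$ is the boundary kernel of Müller and Wang (1994),

$$K_n(x,w) = \begin{cases}
1(x - a \le w \le x + a)\, K_{11}\big(\frac{x - w}{a}\big) & \text{if } a \le x \le \tau - a , \\
1(w \le (1 + q)a)\, K_{1q}\big(q - \frac{w}{a}\big) & \text{if } 0 \le x < a , \\
1(\tau - (1 + p)a \le w)\, K_{p1}\big(\frac{\tau - w}{a} - p\big) & \text{if } \tau - a < x \le \tau ,
\end{cases}$$

where

$$K_{11}(r) = 2 C(\mu) \big(\tfrac{1}{2}\big)^{2\mu+2} (1 + r)^{\mu} (1 - r)^{\mu} \quad \text{(central region)} ,$$

$$K_{pq}(r) = C(\mu) \big(\tfrac{1}{p+q}\big)^{2\mu+2} (p + r)^{\mu} (q - r)^{\mu-1} [2r((p - q)\mu - q) + \mu(p - q)^2 + 2q^2] \quad \text{(left boundary region)} ,$$

$$K_{pq}(r) = C(\mu) \big(\tfrac{1}{p+q}\big)^{2\mu+2} (p + r)^{\mu-1} (q - r)^{\mu} [2r((p - q)\mu + p) + \mu(p - q)^2 + 2p^2] \quad \text{(right boundary region)} ,$$

$p, q \in (0, 1)$, and $C(\mu) = 2(2\mu + 1)\binom{2\mu - 1}{\mu}$. The kernels are Jacobi polynomials,
and for $(p, q) = (1, 1)$, $(1, q)$ and $(p, 1)$, we have

$$\int_{-p}^{q} K_{pq}(u) du = 1 , \qquad \int_{-p}^{q} u K_{pq}(u) du = 0 , \qquad \int_{-p}^{q} u^2 K_{pq}(u) du < \infty .$$

Table 3.1 gives the form of these kernels for polynomials of degree 2, 4, and 6.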

Table 3.1 about here 
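The moment identities for the boundary kernels can be checked numerically. The following Python sketch evaluates the left and right boundary kernels as reconstructed above, with $C(\mu) = 2(2\mu+1)\binom{2\mu-1}{\mu}$, and integrates them over $[-p, q]$ by Gauss-Legendre quadrature, which is exact for polynomials of this degree. The function names are hypothetical.

```python
from math import comb

import numpy as np


def c_mu(mu):
    """Normalising constant C(mu) = 2 (2 mu + 1) binom(2 mu - 1, mu)."""
    return 2 * (2 * mu + 1) * comb(2 * mu - 1, mu)


def k_left(r, p, q, mu):
    """Left boundary kernel on [-p, q]; equals the central kernel at p = q = 1."""
    scale = c_mu(mu) / (p + q) ** (2 * mu + 2)
    bracket = 2 * r * ((p - q) * mu - q) + mu * (p - q) ** 2 + 2 * q ** 2
    return scale * (p + r) ** mu * (q - r) ** (mu - 1) * bracket


def k_right(r, p, q, mu):
    """Right boundary kernel on [-p, q]."""
    scale = c_mu(mu) / (p + q) ** (2 * mu + 2)
    bracket = 2 * r * ((p - q) * mu + p) + mu * (p - q) ** 2 + 2 * p ** 2
    return scale * (p + r) ** (mu - 1) * (q - r) ** mu * bracket


def moment(kernel, p, q, mu, order, nodes=32):
    """Integral of r^order * kernel(r) over [-p, q] by Gauss-Legendre quadrature."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * (p + q) * x + 0.5 * (q - p)       # map [-1, 1] onto [-p, q]
    return 0.5 * (p + q) * np.sum(w * r ** order * kernel(r, p, q, mu))
```

For instance, `moment(k_left, 1.0, 0.3, 1, 0)` should be numerically equal to 1 and `moment(k_left, 1.0, 0.3, 1, 1)` numerically equal to 0; with $\mu = 1$ and $p = q = 1$ the kernel reduces to the Epanechnikov kernel.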

In the following we assume that $(u,x) \in \mathcal{R} = [0, \tau_0] \times [0, \tau]$, $\tau_0 < \infty$, $\tau < \infty$. To
control the bias of the risk process and the Aalen-Nelson estimator, we need the
following regularity conditions.
Condition A 

(i) The variables $X_{im}$ have densities $f_m(x)$ with respect to Lebesgue measure
on $[0, \tau]$.

(ii) There exists a bounded open neighbourhood $B$ of the true parameter value
$\beta_0$ such that $\mathrm{E} \sum_m |Z_{im}(u)|^k Y_{im}(u) \exp[\beta^T Z_{im}(u)] < \infty$, for $k = 0, 1, 2$.






DOROTA M. DABROWSKA AND WAI TUNG HO 



(iii) For $k = 0, 1, 2$, and $\beta \in B$, the functions

$$s^{(k)}(u, \beta, w) = \sum_m \mathrm{E}([Z_{im}(u)]^{\otimes k} Y_{im}(u) \exp[\beta^T Z_{im}(u)] \mid X_{im} = w) f_m(w)$$

are uniformly bounded and twice differentiable with respect to $\beta$. In addition
$\nabla s^{(0)}(u, \beta, w) = s^{(1)}(u, \beta, w)$, $\nabla^2 s^{(0)}(u, \beta, w) = s^{(2)}(u, \beta, w)$, and the
functions $s^{(k)}(u, \beta, w)$, $k = 0, 1, 2$, are uniformly Lipschitz continuous in $\beta$.

(iv) The function $\alpha(u, w)$, $(u, w) \in \mathcal{R}$, is bounded.

(v.1) The functions $s(u, \beta, w) = s^{(k)}(u, \beta, w)$, $k = 0, 1, 2$, and $\alpha(u, w)$
satisfy $\sup\{|\alpha(u, w_1) - \alpha(u, w_2)| : (u, w_j) \in \mathcal{R}, |w_1 - w_2| \le a, j = 1, 2\} =
O(a)$ and $\sup\{|s(u, \beta, w_1) - s(u, \beta, w_2)| : (u, w_j) \in \mathcal{R}, |w_1 - w_2| \le a, \beta \in
B, j = 1, 2\} = O(a)$.

(v.2) $s(u, \beta, w)$ and $\alpha(u, w)$ are twice differentiable with respect to $w$ with
uniformly bounded second derivatives $s''(u, \beta, w)$, $\alpha''(u, w)$ such that
$\sup\{|\alpha''(u, w_1) - \alpha''(u, w_2)| : (u, w_j) \in \mathcal{R}, |w_1 - w_2| \le a, j = 1, 2\} = o(1)$
and $\sup\{|s''(u, \beta, w_1) - s''(u, \beta, w_2)| : (u, w_j) \in \mathcal{R}, |w_1 - w_2| \le a, \beta \in B, j =
1, 2\} = o(1)$.

We refer to this condition as A.1 or A.2, depending on whether the assumption
(v.1) or (v.2) is in force. For $k = 1, 2$, let $S^{(k)}_{-i}(u, \beta, x) = \nabla^k S^{(0)}_{-i}(u, \beta, x)$ be
the vector and matrix of first and second derivatives of the risk process with
respect to $\beta$. Set $\bar s^{(k)}(u, \beta, x) = a^{-1} \mathrm{E}\, S^{(k)}_i(u, \beta, x)$, $\bar n(u, x) = a^{-1} \mathrm{E}\, N_i(u, x)$ and

$$\bar A(v; x, \beta_0) = \int_0^v \frac{\bar n(du; x)}{\bar s^{(0)}(u, \beta_0, x)} .$$

Proposition 3.2 Under assumptions A we have $\bar s^{(k)}(u, \beta, x) - s^{(k)}(u, \beta, x) =
O(a^r)$ for $k = 0, 1, 2$, uniformly in $(u, x) \in \mathcal{R}$ and $\beta \in B$, and $\bar A(v; x, \beta_0) -
A(v; x) = O(a^r)$ uniformly in $(v, x) \in \mathcal{R}$. Here $r = 1$ under condition A.1 and
$r = 2$ under condition A.2.

Proof. Dropping the superscript $k$, in the central region we have

$$a^{-1} \mathrm{E}\, S_i(u, \beta, x) = a^{-1} \int_{x-a}^{x+a} K_{11}\Big(\frac{x - w}{a}\Big) s(u, \beta, w) dw = \int_{-1}^{1} K_{11}(r) s(u, \beta, x - ra) dr .$$

In the left and right boundary regions, the expectation $a^{-1} \mathrm{E}\, S_i(u, \beta, x)$ is

$$a^{-1} \int_{x-qa}^{x+a} K_{1q}\Big(\frac{x - w}{a}\Big) s(u, \beta, w) dw = \int_{-1}^{q} K_{1q}(r) s(u, \beta, x - ra) dr ,$$

$$a^{-1} \int_{x-a}^{x+pa} K_{p1}\Big(\frac{x - w}{a}\Big) s(u, \beta, w) dw = \int_{-p}^{1} K_{p1}(r) s(u, \beta, x - ra) dr .$$

In the left boundary region, $q = x/a$ and in the right boundary region $p =
(\tau - x)/a$. Under condition (v.1), we have $|a^{-1} \mathrm{E}\, S_i(u, \beta, x) - s(u, \beta, x)| = O(a)$,
uniformly in $(u, x) \in \mathcal{R}$ and $\beta \in B$. Under condition (v.2), we have

$$a^{-1} \mathrm{E}\, S_i(u, \beta, x) - s(u, \beta, x) = \frac{a^2}{2} s''(u, \beta, x) \int_{-p}^{q} r^2 K_{pq}(r) dr + o(a^2) .$$

Similarly $\bar n(u, x) = \int_0^u \int_{-p}^{q} s^{(0)}(v, \beta_0, x - ra) \alpha(v, x - ra) K_{pq}(r) dr dv$. Therefore,
if one of the two functions ($s$ or $\alpha$) is Lipschitz of order 1, then $\bar n(u, x) -
\int_0^u s(v, \beta_0, x) \alpha(v, x) dv = O(a)$, whereas if both functions are twice differentiable
in $x$, then the bias is

$$\frac{a^2}{2} \int_0^u \Big\{\frac{\partial^2}{\partial x^2} [s^{(0)}(v, \beta_0, x) \alpha(v, x)]\Big\} dv \int_{-p}^{q} r^2 K_{pq}(r) dr + o(a^2) .$$

We also have $\bar A(v; x, \beta_0) - A(v; x) = \int_0^v \gamma(u, x) A(du; x)$, where $\gamma(u, x) =
[\bar n(du, x) / (\bar s^{(0)}(u, \beta_0, x) \alpha(u, x) du)] - 1$. Thus the bias is of order $O(a^r)$, $r = 1, 2$. □

We turn now to estimation of the regression coefficients. The first method
corresponds to an M-estimator obtained by solving the score equation $\Phi_n(\beta) = 0$,
where

$$\Phi_n(\beta) = \frac{1}{n} \sum_{i=1}^{n} \sum_m \int_0^{\tau_0} [Z_{im}(u) S^{(0)}_{-i}(u, \beta, X_{im}) - S^{(1)}_{-i}(u, \beta, X_{im})] N_{im}(du) .$$

The analysis of this score equation requires only smoothness conditions A.1 and
second moment bounds on the risk processes. For the sake of convenience, these
moment bounds are given in the appendix. Let

$$V(u, \beta, x) = \Big[\frac{s^{(2)}}{s^{(0)}} - \Big(\frac{s^{(1)}}{s^{(0)}}\Big)^{\otimes 2}\Big](u, \beta, x) .$$

Proposition 3.3 Suppose that the conditions A.1 and D.2 (i)-(ii) hold. Let
$\Sigma_1(\beta_0) = \int_{\mathcal{R}} (V [s^{(0)}]^2)(u, \beta_0, x) \alpha(u, x) du dx$ and $\Sigma_2(\beta_0) =
\int_{\mathcal{R}} (V [s^{(0)}]^3)(u, \beta_0, x) \alpha(u, x) du dx$. Suppose that $\Sigma_1(\beta_0)$ is a non-singular matrix,
that $na^2 \downarrow 0$ and $na \uparrow \infty$. With probability tending to 1, the score equation
$\Phi_n(\beta) = 0$ has a unique root $\hat\beta$ and $\sqrt{n}(\hat\beta - \beta_0)$ converges in distribution to a
mean zero normal variable with covariance $\Sigma_1^{-1}(\beta_0) \Sigma_2(\beta_0) [\Sigma_1^{-1}(\beta_0)]^T$.

The proof is given in Appendix D. The next Proposition deals with asymp- 
totic normality of the Aalen-Nelson estimator. We need the following consistency 
assumption on the risk function. 

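Condition B Suppose that $\inf\{s^{(0)}(u, \beta, w) : u \le \tau_0, \beta \in B, w \in [0 \vee (x - a_n), (x +
a_n) \wedge \tau]\} > 0$. Moreover, that under assumption A.r, $r = 1, 2$, we have

$$\max_i \mathrm{E} \sup_{\beta \in B, u \le \tau_0} \Big| \frac{S^{(0)}_{-i} - \bar s^{(0)}}{\bar s^{(0)}} \Big| (u, \beta, x) \to 0$$

for a bandwidth sequence $a = a_n \downarrow 0$ such that $na \uparrow \infty$ and $na^{2r+1} \downarrow 0$.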

Proposition 3.4 Suppose that conditions A.r ($r = 1, 2$), B and D.1 are satisfied.
For any root-n consistent estimate $\hat\beta$ of the parameter $\beta_0$, the process
$\{\sqrt{na}[\hat A(v; x, \hat\beta) - A(v; x)], v \le \tau_0\}$ converges weakly in $\ell^\infty([0, \tau_0])$ to a mean zero
Gaussian process $G(v, x)$ with covariance

$$\mathrm{cov}[G(v, x), G(v', x)] = d_{p(x) q(x)}(K) \int_{[0, v \wedge v']} \frac{A(du; x)}{s^{(0)}(u, \beta_0, x)} .$$

Here $r = 1$ under condition A.1 and $r = 2$ under assumptions of condition A.2.
Moreover, $d_{pq}(K) = \int_{-p}^{q} K^2_{pq}(w) dw$ and $p(x) = q(x) = 1$ if $a \le x \le \tau - a$,
$p = 1$, $q(x) = a^{-1} x$ if $0 \le x < a$ and $p(x) = a^{-1}(\tau - x)$, $q(x) = 1$ if $\tau - a < x \le \tau$.

Finally, we consider a partial score likelihood estimate of the regression
coefficient. It is obtained by solving the score equation $\Phi_n(\beta) = 0$, where

$$\Phi_n(\beta) = \frac{1}{n} \sum_{i=1}^{n} \sum_m \int_0^{\tau_0} \Big[Z_{im}(u) - \frac{S^{(1)}_{-i}}{S^{(0)}_{-i}}(u, \beta, X_{im})\Big] N_{im}(du) .$$

Note that this score function is similar to that arising in the standard Cox
regression, except that we use leave-one-out risk processes. The choice of risk processes
$S^{(k)} = \sum_{i=1}^{n} S^{(k)}_i$, $k = 1, 2$, is also possible. In both cases the resulting score
functions form an approximate V process of degree 4 and the difference between them
converges in probability to 0, but only under stronger moment conditions than
those considered in Appendix D.

To analyze the score function $\Phi_n(\beta)$, we require condition A.2, moment
conditions, and the following uniform consistency assumption.
Condition C Suppose that $\inf\{s^{(0)}(u, \beta, x) : (u, x) \in \mathcal{R}, \beta \in B\} > 0$. Moreover,
that

$$\max_i \mathrm{E} \sup_{(u, x) \in \mathcal{R}, \beta \in B} \Big| \frac{S^{(0)}_{-i} - \bar s^{(0)}}{\bar s^{(0)}} (u, \beta, x) \Big| \to 0$$

for a bandwidth sequence a n [ 0, na^ f oo, na^ J, 0. 

Proposition 3.5 Suppose that conditions A.2, C, D.2 are satisfied and the matrix
$\Sigma(\beta_0) = \int_{\mathcal{R}} (V s^{(0)})(u, \beta_0, x) \alpha(u, x) du dx$ is non-singular. With probability
tending to 1, the score equation $\Phi_n(\beta) = 0$ has a unique root $\hat\beta$, and $\sqrt{n}(\hat\beta - \beta_0)$
converges in distribution to a mean zero normal variable with covariance $\Sigma^{-1}(\beta_0)$.

The proofs of these propositions are given in Appendices B-D. Similar to
the approach of Pons and Visser (2000) we use U-process theory. Whereas
in their setting asymptotic normality results for the estimate $\hat\beta$ were obtained
based on analysis of U-statistics of degree 2, in our case the term $R_{1n}$ of their
Proposition 3 satisfies only $R_{1n} = \sqrt{n}\, O_P(1) \sup_{u, \beta, x} |S^{(0)} - \bar s^{(0)}|(u, \beta, x)$. (Here
$S^{(0)} = (na)^{-1} \sum_i S^{(0)}_i$.) In the case of one jump processes with bounded time
independent covariates, say, results of Einmahl and Mason (2000) imply that
the supremum is of order $O(\sqrt{\log a^{-1} / (na)})$ a.s., so that the term $R_{1n}$ diverges to
infinity. In the following we therefore use expansions of higher order.

Except for moment bounds, the proofs of these propositions do not use any
special properties of the $Z$ process, and we do not require uniform consistency of
the derivatives $S^{(k)}_{-i}$, $k = 1, 2$. On the other hand, assumptions B and C require a
more detailed specification of the covariate $Z$ in order to apply inequalities from
empirical process theory. The following proposition gives one set of conditions
under which these assumptions hold. We consider the assumption C only. Let
$\mathcal{R}_{1n} = \{(u, x) \in \mathcal{R} : a \le x \le \tau - a\}$, $\mathcal{R}_{2n} = \{(u, x) \in \mathcal{R} : 0 \le x < a\}$ and
$\mathcal{R}_{3n} = \{(u, x) \in \mathcal{R} : \tau - a < x \le \tau\}$. Let $\mathcal{H}_{pn} = \{h(u, \beta, x) : (u, x) \in \mathcal{R}_{pn}, \beta \in B\}$,
$p = 1, 2, 3$, where $h(u, \beta, x) =
s^{-1}(u, \beta, x) \sum_m Y_m(u) e^{\beta^T Z_m(u)} K_n(x, X_m)$. Note that for large $n$

$$\max_i \mathrm{E} \sup \Big| \frac{S^{(0)}_{-i} - \bar s^{(0)}}{\bar s^{(0)}} \Big| (u, \beta, x)$$

is of the same order as $\mu_{pn} = \mathrm{E} \sup\{|h - \mathrm{E} h|(u, \beta, x) : (u, x) \in \mathcal{R}_{pn}, \beta \in B\}$.

Proposition 3.6 Suppose that for some $r > 2$ the bandwidth sequence satisfies
$a_n \downarrow 0$, $na_n \uparrow \infty$, $b_n = \log a_n^{-1} / (na_n) \downarrow 0$, $a_n^{-1} b_n^{r/2 - 1} = O(1)$ and there exists a
random variable $H_{1n}$, such that 1) $\mathrm{E} H_{1n}^r = O(1)$;

2) $\|h(u, \beta, x)\|_{L_2(P)} \le \sqrt{a_n} \|H_{1n}\|_{L_2(P)}$ and 3) $N_{[\,]}(\epsilon \|H_{1n}\|_{L_2(P)}, \mathcal{H}_{1n}, \|\cdot\|_{L_2(P)}) \le
[A \epsilon^{-1}]^V$ for some finite constants $A$ and $V$ not depending on $n$ and $\epsilon \in (0, 1)$.
Then $\mu_{1n} = O(\sqrt{b_n})$. If in addition there exist random variables $H_{pn}$, $p = 2, 3$,
such that 4) $\mathrm{E} H_{pn}^2 = O(a)$ and 5) $N_{[\,]}(\epsilon \|H_{pn}\|_{L_2(P)}, \mathcal{H}_{pn}, \|\cdot\|_{L_2(P)}) \le [A_p \epsilon^{-1}]^{V_p}$
for some finite constants $A_p$ and $V_p$ not depending on $n$ and $\epsilon \in (0, 1)$, then in
the boundary regions we have $\mu_{pn} = O((na_n)^{-1/2})$, $p = 2, 3$.

Here $\|\cdot\|_{L_2(P)}$ is the $L_2(P)$ norm, and $N_{[\,]}(\eta, \mathcal{H}_{pn}, \|\cdot\|_{L_2(P)})$ is the minimal
number of brackets of $L_2(P)$-size $\eta$ covering the class $\mathcal{H}_{pn}$.

Proof. By Theorem 2.14.2 in van der Vaart and Wellner ((1996), p. 240), in the
central region we have

$$\mu_{1n} \le \frac{1}{\sqrt{n}\, a_n} J_{[\,]}(\sqrt{a_n}, \mathcal{H}_{1n}, \|\cdot\|_{L_2(P)}) + a_n^{-1} \mathrm{E} H_{1n} 1(H_{1n} > \sqrt{n}\, c(\sqrt{a_n})) \qquad (3.1)$$

where $J_{[\,]}(\delta, \mathcal{H}, \|\cdot\|_{L_2(P)}) = \int_0^\delta [1 + \log N_{[\,]}(\epsilon \|H\|_{L_2(P)}, \mathcal{H}, \|\cdot\|_{L_2(P)})]^{1/2} d\epsilon$ and $c(\delta) =
\delta \|H\|_{L_2(P)} / [1 + \log N_{[\,]}(\delta \|H\|_{L_2(P)}, \mathcal{H}, \|\cdot\|_{L_2(P)})]^{1/2}$. For $\delta = \sqrt{a_n}$ the first term
of (3.1) is of order $O(\sqrt{b_n})$. Since $c(\sqrt{a_n}) = O(\sqrt{a_n / \log a_n^{-1}})$, the second term
is bounded by $a_n^{-1} (\sqrt{n}\, c(\sqrt{a_n}))^{1-r} \mathrm{E} H_{1n}^r = O(\sqrt{b_n}) O(a_n^{-1} b_n^{r/2 - 1}) = O(\sqrt{b_n})$. The
same theorem in van der Vaart and Wellner (1996) implies that in the boundary
regions we have $\mu_{pn} = n^{-1/2} a^{-1} O(J_{[\,]}(1, \mathcal{H}_{pn}, \|\cdot\|_{L_2(P)})) \|H_{pn}\|_{L_2(P)} = O((na_n)^{-1/2})$,
$p = 2, 3$. □

Using a somewhat tedious argument, it is not difficult to show that conditions
of this proposition are satisfied in the case of covariates not dependent on $u$.
Under added envelope conditions, the proposition is also satisfied by Lipschitz
continuous covariates, covariates that form functions of bounded variation, etc.
4. Multi-type event processes

The results of the previous section extend to the multistate setting provided
the state space of the process is "small". An example is provided by an illness-death
process in which a person in "healthy" state (0) can either progress to a
"death" state (2), or can first develop a reversible disease (state 1) and subsequently
die. In the absence of censoring, the cumulative transition rates are
given by

$$\Lambda_{ij}(t) = \Lambda_{ij}(T_m) + 1(J_m = i) \int_{(T_m, t]} e^{\beta^T Z_{ijm}(s - T_m)} \alpha_{ij}(s - T_m, X_m) ds$$

for $t \in (T_m, T_{m+1}]$. Similarly to multi-type processes in Andersen et al (1993),
estimation of regression coefficients can be based on the score function

$$\Phi_n(\beta) = \frac{1}{n} \sum_{i=1}^{n} \sum_h \sum_m \int \Big[Z_{ihm}(u) - \frac{S^{(1)}_{-ih}}{S^{(0)}_{-ih}}(u, \beta, X_{im})\Big] N_{ihm}(du) ,$$

where the sum extends over pairs $h = (0, 1), (0, 2), (1, 2), (2, 1)$ of possible one-step
transitions,

$$S^{(0)}_{-ih}(u, \beta, x) = \frac{1}{(n-1) a_h} \sum_{j \ne i} \sum_m Y_{jhm}(u) e^{\beta^T Z_{jhm}(u)} K_n(x, X_{jm}) ,$$

and $S^{(1)}_{-ih}$ is the derivative of this process with respect to $\beta$. Note that the bandwidth
sequence $a = a_{nh}$ is taken here to depend on the transition type $h$. The
orthogonality relations of Lemma 2.1 imply that the score function is asymptotically
normal with covariance matrix $\sum_h \Sigma_h(\beta)$, where the matrices $\Sigma_h$ assume a
similar form as in Proposition 3.5. The M-estimator of Proposition 3.3 provides
an alternative estimate.

Another example of a multi-type process is provided by progressive multi- 
state models. In this case a subject may move among a finite number of transient 
states, but each such state can be visited at most once. As an example of such 
a model we consider data on 3020 bone marrow transplant (BMT) recipients for 
acute myelogenous leukemia (AML) and acute lymphoblastic leukemia (ALL).
The data were collected by the International Bone Marrow Transplant Registry 
(IBMTR) during the period 1991-2000. Only first transplants in remission are 
considered and all patients received transplant from an HLA-identical sibling. 
Transplant recipients first receive high doses of chemotherapy and radiation to
destroy malignant cells in bone marrow and elsewhere. To rescue them from 
the toxicity of this therapy, they subsequently receive bone marrow cells from a 
suitably matched donor. 

In the following we denote by TX the transplant state. It can be followed
by a number of complications, among them graft-versus-host disease (GVHD),
relapse and death in remission. Two forms of GVHD are usually distinguished.
Acute GVHD (AGVHD) occurs in the first 2-3 months following transplant,
whereas chronic GVHD (CGVHD) occurs later in time. We use time independent
covariates corresponding to $X$ = square root of patient's age at transplant, and
binary covariates ($Z$) representing donor-recipient sex-match, disease type and
GVHD prophylaxis treatment. The square-root transformation of age serves to
reduce skewness of the data. Removal of T-cells from the donor's bone marrow
and posttransplant administration of immune suppressive drugs are the major
GVHD prophylactic treatments.

We are interested in the dependence of the intensities of one-step transitions
on age. In Figures 4.1-4.3 we show plots of the baseline cumulative hazards
$A_{ij}(v|x)$ as functions of $x$. Note that for fixed $x$, $A_{ij}(v|x)$ is an increasing
function of $v$, but for fixed $v$ this function may assume a variety of forms. Figure
4.1 shows that cumulative hazards of transitions TX -> AGVHD, TX -> CGVHD
and AGVHD -> CGVHD are increasing functions of age, and this monotonicity
pattern is most pronounced in the case of transitions into the CGVHD state. The
cumulative hazards of transitions TX -> death and CGVHD -> death are both
U-shaped functions, suggesting higher incidence of death among older and very
young patients. Finally, the graphs of cumulative hazards of transitions into the
relapse state are decreasing functions of age, though nearly constant in age in the
upper tail. Note that in the case of transitions originating from the TX state,
all 3020 subjects enter into the risk process. However, transitions originating
from the GVHD states use only those subjects who progress to the AGVHD
and/or CGVHD state. In particular, a total of 560 patients progressed into the
CGVHD state. Subsequently 100 developed relapse and 170 died in remission.
Thus transitions from the CGVHD state are heavily censored. The relatively
small number of relapses accounts for the noisy graphs of the cumulative hazards



A MODULATED RENEWAL PROCESS 






of the CGVHD -> relapse state.

Figures 4.1-4.3 about here 

The regression coefficients for the model are reported in Table 4.1. As in 
any multistate analysis based on the proportional hazard model, the regression 
coefficients do not have a clear meaning. For example, male recipients receiving 
transplant from a female donor are at higher risk for progression from the trans- 
plant state into the AGVHD and CGVHD state, but are also at lower risk for 
direct (one-step) transition from the transplant into the relapse state. The overall 
effect of this covariate on the occurrence of death in remission or relapse cannot 
be, however, directly assessed based on regression coefficients because patients 
who develop AGVHD are at higher risk for death in remission, and also female- 
to-male transplant increases the risk of CGVHD to relapse transition. Likewise, 
the direction of the regression coefficients corresponding to each of the GVHD 
prophylactic treatments varies from one transition to another. Examples of pa- 
rameters which can be used to summarize effects of covariates on the occurrence 
of endpoint events were discussed in Klein, Keiding and Copelan (1993), Arjas 
and Eerola (1993) and Dabrowska, Sun and Horowitz (1994). Their extension to 
the present setting is beyond the scope of this paper. 

Table 4.1 about here 

Appendix A: Preliminaries

Let $W_1, \ldots, W_n$ be iid random variables with some distribution $P$. An (asymmetric)
U statistic of degree $m$, $m \ge 1$, is denoted by

$$U_{n,m}(h) = \frac{(n - m)!}{n!} \sum_{(i_1, \ldots, i_m) \in I_n^m} h(W_{i_1}, \ldots, W_{i_m}) ,$$

where $I_n^m$ is the collection of vectors $(i_1, \ldots, i_m)$ with distinct coordinates, each
in $\{1, \ldots, n\}$. Assuming that the kernel $h$ satisfies $\mathrm{E} |h(W_1, \ldots, W_m)| < \infty$, the
Hoeffding projection of degree $m$ of the kernel $h$ is denoted by $\pi_m h(W_1, \ldots, W_m)$.
We have $\pi_m h(W_1, \ldots, W_m) = \sum_{A \subseteq \{1, \ldots, m\}} (-1)^{m - |A|} \mathrm{E}_A h(W_1, \ldots, W_m)$, where
for $A = \{i_1, \ldots, i_p\}$, $1 \le p \le m$, $\mathrm{E}_A$ denotes conditional expectation with
respect to variables $\{W_j, j \in A\}$ and $\mathrm{E}_\emptyset h(W_1, \ldots, W_m) = \mathrm{E}\, h(W_1, \ldots, W_m)$.
Then $U_{n,m}(\pi_m h)$ forms a canonical U statistic of degree $m$. For canonical
U-processes indexed by classes of kernels changing with $n$, Lemma 3.5.2, Remarks
3.5.4 and inequality (5.4.3) in de la Peña and Giné (1999) provide the following.

Lemma 5.7 Let $\{U_{n,m}(h) : h \in \mathcal{H}_n\}$ be a canonical U-process over a measurable
class $\mathcal{H}_n$ of (asymmetric) kernels of degree $m$. If $\mathcal{H}_n$ forms a Euclidean class
of functions for a square integrable envelope $H_n$, then $\mathrm{E}\, n^{m/2} \|U_{n,m}(h)\|_{\mathcal{H}_n}
= O((\mathrm{E} [H_n(W_1, \ldots, W_m)]^2)^{1/2})$.

A measurable class of functions $\mathcal{H}$ defined on some measure space $(\Omega, \mathcal{A})$
is Euclidean for envelope $H$ if $|h| \le H$ for all $h \in \mathcal{H}$, and there exist constants
$A$ and $V$ such that $N(\epsilon \|H\|_{L_2(P)}, \mathcal{H}, \|\cdot\|_{L_2(P)}) \le (A/\epsilon)^V$ for all $\epsilon \in (0, 1)$ and
all probability measures $P$ such that $\|H\|_{L_2(P)} < \infty$ (Nolan and Pollard, 1987).
Here $\|\cdot\|_{L_2(P)}$ is the $L_2(P)$ norm and $N(\eta, \mathcal{H}, \|\cdot\|_{L_2(P)})$ is the minimal number of
$L_2(P)$-balls of radius $\eta$ covering the class $\mathcal{H}$. In the case of classes $\mathcal{H}_n$ changing
with $n$, the Euclidean constants $A$ and $V$ are taken to be independent of $n$.

In the following we shall use U processes of degree $m = 1, 2, 3, 4$. Finally,
in our case for each subject $i$, the sequence $W_i$ represents the total number of
events observed in the interval $[0, \tau_0]$, their times of occurrence, types and
covariates observed at each jump time. The Euclidean property of the classes
of functions appearing in the remainder of the text can be easily verified based
on results of Nolan and Pollard (1987), Pakes and Pollard (1989) and Giné and
Guillou (1999).
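As an illustration of the U statistic $U_{n,m}(h)$ defined at the start of this appendix, the following Python sketch averages a kernel over all ordered $m$-tuples of distinct indices. It is a brute-force evaluation intended only for small samples; the function name and the example kernels are hypothetical.

```python
from itertools import permutations
from math import factorial


def u_statistic(h, sample, m):
    """Asymmetric U statistic of degree m.

    Averages h over all ordered m-tuples of distinct indices, that is,
    (n - m)! / n! times the sum over the index set I_n^m.
    """
    n = len(sample)
    total = sum(
        h(*(sample[i] for i in idx)) for idx in permutations(range(n), m)
    )
    return total * factorial(n - m) / factorial(n)
```

For the sample `[1, 2, 3]` and the kernel `h(a, b) = a * b`, the six ordered pairs give products summing to 22, so the statistic equals 22/6. An antisymmetric kernel such as `h(a, b) = a - b` averages to zero because each ordered pair appears together with its reversal.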

Appendix B: Regularity conditions and two lemmas 

We give some additional regularity conditions. 
Condition D.0 (i) For sequences $(m) = (m_1, m_2)$, $m_1 \ne m_2$, of nonnegative
integers the variables $X_{(m)} = (X_{m_1}, X_{m_2})$ have joint density $f_{(m)}$ with respect to
Lebesgue measure on $[0, \tau]^2$.

(ii) For sequences $[m] = (m_1, m_2, m_3)$ of distinct nonnegative integers, the
variables $X_{[m]} = (X_{m_1}, X_{m_2}, X_{m_3})$ have joint densities $f_{[m]}$ with respect to
Lebesgue measure on $[0, \tau]^3$.

For any vector, we denote by $|\cdot|$ the $\ell_1$ norm. Without loss of generality we
assume that the neighbourhood $B$ surrounding the true parameter $\beta_0$ corresponds
to a ball $B = \{\beta : |\beta - \beta_0| \le c_B\}$.
For nonnegative integers $p$ and $m$ define $\theta^{(p)}_m(u) = |Z_m(u)|^p Y_m(u) e^{(|\beta_0| + c_B) |Z_m(u)|}$
and $\theta^{(p)}_{0m}(u, \beta) = |Z_m(u)|^p Y_m(u) e^{\beta^T Z_m(u)}$. For $u \in [0, \tau_0]$, $u = (u_1, u_2) \in [0, \tau_0]^2$, $u =
(u_1, u_2, u_3) \in [0, \tau_0]^3$, and $w \in [0, \tau]$, $w = (w_1, w_2) \in [0, \tau]^2$, $w = (w_1, w_2, w_3) \in
[0, \tau]^3$, let

2 



cr piiP2 (u,W 



Y J n\\oZ 3 \u J )\x m = W ]f m (w), 

m j=l 
2 

^E[]Je%\u J )\x im) = w}f {m) (w) , 

(m) j=l 

3 

A)) I] A))|^m = H/mH , 

m j=2 
2 

^E^K.^n^^.A))!^) = ™]/(m)(™) , 
(m) j=l 
3 

M j=2 
(m) 

2 E [0$ (u, /3 ) |X [m] = ™]/ [m] (W) . 



Under conditions D.1 and D.2 these expectations exist, at least in local neighbourhoods
of a point $x \in [0, \tau]$. Such local neighbourhoods correspond to sets
$\mathcal{R}(x) = \{(u, w) \in \mathcal{R} : |w - x| \le a\}$.

Condition D.1 (i) The condition D.0 (i) is satisfied and for integers $p_1, p_2$ such
that $p_j \ge 0$, $p_1 + p_2 \le 4$, we have

$\sup\{\sigma_{p_1,p_2}(u,w) : (u_1,w) \in \mathcal{R}(x), (u_2,w) \in \mathcal{R}(x)\} = O(1)$,
$\sup\{|\rho_{p_1,p_2}(u,w)| : (u_j,w_j) \in \mathcal{R}(x), j = 1, 2\} = O(1)$.

(ii) The condition D.0 (ii) is satisfied, and

$\sup\{\kappa_{1;0}(u,w) : (u_j,w) \in \mathcal{R}(x), j = 1,2,3\} = O(1)$,
$\sup\{\kappa_{2;0}(u,w) : (u_1,w_1) \in \mathcal{R}(x), (u_2,w_2) \in \mathcal{R}(x), (u_3,w_1) \in \mathcal{R}(x)\} = O(1)$,
$\sup\{\kappa_{3;0}(u,w) : (u_j,w_j) \in \mathcal{R}(x), j = 1,2,3\} = O(1)$,
$\sup\{s_{;2}(u,w) : (u,w_j) \in \mathcal{R}(x), j = 1,2\} = O(1)$,
$\sup\{s_{;3}(u,w) : (u,w_j) \in \mathcal{R}(x), j = 1,2,3\} = O(1)$.

Condition D.2 (i) The condition D.0 (i) is satisfied and, for integers $p_1, p_2$ such
that $p_j \ge 0$, $p_1 + p_2 \le 4$, we have

$\sup\{\sigma_{p_1,p_2}(u,w) : (u_1,w) \in \mathcal{R}, (u_2,w) \in \mathcal{R}\} = O(1)$,
$\sup\{|\rho_{p_1,p_2}(u,w)| : (u_j,w_j) \in \mathcal{R}, |w_2 - w_1| \le a, j = 1,2\} = O(1)$.

(ii) The condition D.0 (ii) is satisfied and, for $p = 0, 1$, we have

$\sup\{\kappa_{1;p}(u,w) : (u_j,w) \in \mathcal{R}, j = 1,2,3\} = O(1)$,
$\sup\{\kappa_{2;p}(u,w) : (u_1,w_1) \in \mathcal{R}, (u_2,w_2) \in \mathcal{R}, (u_3,w_1) \in \mathcal{R}, |w_2 - w_1| \le a\} = O(1)$,
$\sup\{\kappa_{3;p}(u,w) : (u_j,w_j) \in \mathcal{R}, |w_2 - w_1| \le a, |w_3 - w_2| \le a, j = 1, 2, 3\} = O(1)$.

We now give two lemmas which collect bounds on certain random variables
arising in the analysis of the Aalen-Nelson estimate. Both can be verified using
elementary algebra, Hölder's inequality and conditions A and D.

Lemma 6.8 Suppose that inf{s (0) (n, (3, x) : f3 G B,u < r } > 0. For k = 0,1,2, 
let J kni (u,f3,x) = [^KA^-'Em^K^K^^m)! and f* kni (u,(3,x) = 
[s^\u,(3,x)\- 1 Y, m e{k \ u ^)^ X im)\K n {x,X im )\. If conditions A and D.l (i) 

hold, then a-^n^iJfcpmK'M = °W and a ^ E l5=i fk p m(^ M = 
$O(1)$, uniformly in $u_1, u_2 \le \tau_0$ and $\beta \in B$. If in addition the condition D.1 (ii)
holds, then $a^{-1} E \prod_{p=1}^{3} f_{k_p ni}(u_p, \beta, x) = O(1)$ uniformly in $u_1, u_2, u_3 \le \tau_0$. If
$\inf\{s^{(0)}(u, \beta, x) : \beta \in B, (x,u) \in \mathcal{R}\} > 0$ and conditions D.2 hold, then these
bounds are also uniform in $x$, $x \in [0, \tau]$.

Lemma 6.9 Suppose that $\inf\{s^{(0)}(u, \beta, w) : u \le \tau_0, \beta \in B, w \in [(x - a_n) \vee 0, (x + a_n) \wedge \tau]\} > 0$. Set

7 ni («,x) = [s^(u,x)]- 2 [sf\u,x) + S ( i 1 \u,x)s^\u,x)]. 

Sf\u,x) = J2^m^)\Kn(x,X im )\, 
m 

s {0 \u,x) = ^EY im (u)exp([-|/3 | - c B ]\Z im (u))\X im = x)f m (x) , 

m 

sW(n,x) = ^E(^ ) |X,m = ^)/mW , 
nv

m J ° 

g in (u,(3,x) = T \K n (x,X im )\[s^{u^,x)]- l N tm (du) 
™ Jo 



and let 



/•TO 

H 0n {Wi) = a~ 1/2 [g ni (T ,p ,x) + / /o ni (u, (3 , x)du] , 

Jo 



x \a(u, Xi rn ) — a(u, x)\du , 

(0) _ s (0)| 



l Z" 7 " - 

H2n(Wi,Wj) = — -= / f 0ni (u,(3 ,x)g nj (u,p ,x) , 



^3n(Wi) = a I J (u,Po,x)g ni (du,p ,x) , 

H 4n (W t ) = a" 1 / 2 [ TO f ni (u,(3o,x)\a(u,x)dv EA( ^"-' r) 
Jo 



1 /" ro - 

#5n(Wj,Wj) = — / flni{ u ,Po,x)g n j{du,(3 ,x), 



na- JQ 



+ I fni(u,x)g nj (du,x) . 

na j 



If conditions A.r(r = 1,2) and D.l hold, then Eff^(Wi) = 0(1), Efl&,(Wi) = 
©(a- 1 / 2 ), E<(W!) = 0(a 2 ), E# 3 n(Wi) 2 = 0(a 2r ) and Eff^) 2 = 0(a 2r ) 
We also have KH^ n (W 1 ,W 2 ) = O^na)- 1 ), E#| n (Wi, W 2 ) = 0{{na)- 2 ) and 
nE[E {1} H 5n (W u W 2 )) 2 = 0((na)~ r ) = nE[E {2} H 5n (W 1 ,W 2 )} 2 . 

Appendix C: Proof of Proposition 3.4 

Set 

b(v, x) = [ V "/"'^ Mu, x) - s<°) (u, ft, x)]d« , 



/o S (°)(n,/3 ,x) 

where (u, ft , x) =KS^J(u,Po,x), j(u,x) = n(du,x)/a(u,x) and n(v, x) = 
T, m EN im (v)K n (x,X im ). Then y/na[A(v; x, ft) - A (v, x) - b(v, x)} =Z n (v,x) + 
Rn(v, x), where 

Z n (v,x) = \ f^J2Yl I Tblfe M im{du) + R n (v, x) , 
and $R_n(v, x)$ is a remainder term given below. Under conditions A.r, $r = 1, 2$, we
have $\sqrt{na}\, b(v,x) = O(\sqrt{na}\, a^r) = o(1)$. Therefore it is enough to show that the
process $Z_n(v,x)$ converges in $\ell^\infty([0,\tau_0])$ to a time transformed Brownian motion
and the remainder term $R_n$ is asymptotically negligible.
We have $Z_n(v,x) = \sqrt{n}[P_n - P]h_{nv}$ where

The class $\mathcal{H}_n = \{h_{nv} : v \le \tau_0\}$ consists of functions that can be represented as
a linear combination of at most four monotone functions with respect to $v$ and
has envelope $4H_{0n}(W_i)$. By Lemma 6.9 we have (i) $E H_{0n}^2(W_1) = O(1)$ and (ii)
E H 0n (W\)l (H 0n (W\ ) > r)Jn) < EHUW^^)- 1 -► for any r? > 0. Also 
(iii) for any $0 \le v_1 < v_2 \le \tau_0$, the difference $|h_{nv_1} - h_{nv_2}|(W_i)$ is bounded by



A - J J V1 sW)(u,0o,x) V a Jv 1 



Using $(x + y)^2 \le 2(x^2 + y^2)$ and Lemma 2.1, $E|h_{nv_1} - h_{nv_2}|^2(W_1)$ is bounded by
2 p [ S (°)a](u,(3 ,w) „ 2f _ ^ ^ , 8 p p 

'VI p= l 



/ / — mT7 — 5 — , -r 2 -K 2 (x,w)dwdu+- / E TT /^(up, Po, x)du 1 du 2 , 

J Vl J VI i 

and is of order $O(|v_2 - v_1| + |v_2 - v_1|^2)$. Lemmas 2.1 and 3.2 imply (iv)

Var[Z n (vi,x)] = d p{x)Mx) (K) / - - du + O(a) , 

JO "S ^lt) x, po J 

and $\mathrm{cov}[Z_n(v_1,x), Z_n(v_2,x) - Z_n(v_1,x)] = O(a)$. Finally, (v) the class of functions
$\{h_{nv} : v \le \tau_0\}$ has polynomial bracketing number. Properties (i)-(v) and Theorem
2.11.23 in van der Vaart and Wellner (1996) imply that $\{Z_n(v,x) : v \le \tau_0\}$
converges weakly in $\ell^\infty([0,\tau_0])$ to a tight Gaussian process.

The remainder term $R_n(v,x)$ is given by $R_n(v,x) = \sum_{j=1}^{6} R_{jn}(v,x)$ where

1 n r v ~ 

Ri n (v,x) = —=y^ f ni (u,x)du- y/nab(v,x) , 
y/na ^ J 

r~ — pi) 

R2n(v,x) = TV^^Z [fni-^fni](u,x)[g nj -Eg nj ](du,x)], 

n[n - i)a J 






R 3n (v,x) 




v g(0) _ fl (0) 



](u,Po,x) [g ni - E g ni ] (du,x) , 



R in (v,x) 



R 5n (v,x) 



R 6n (v,x) 




Jo 



where 



fni{u, x) 



[ S (°)(n,/3 ,a ; )]- 1 ^y im H e ^^HK n (x,X, m ) 



m 



fni{u,x) = [ S (°)(n,/3 ,x)]- 1 ^[a( U ,^ m )-a(n,x)]y im (u)e^ z ™("^ n (x,X im ) , 



The term $R_{1n}$ has mean zero. By decomposing the integrands and the integrators
into their positive and negative parts, we have $(na)^{1/2} R_{1n}(v,x) +
b(v,x) = P_n h_{1nv}$ where $h_{1nv}(W_i)$ is a sum of four monotone functions, bounded
by $H_{1n}(W_i)$. Thus $R_{1n}(v,x)$ is a normalized empirical process over a Euclidean
class of functions for envelope $4H_{1n}(W_i)$. By Lemmas 6.9 and 5.7, we have
$E H_{1n}(W_i)^2 = O(a^2)$ and $E \sup_v |R_{1n}(v,x)| = O(a)$. Similarly, using envelopes
$H_{3n}$ and $H_{4n}$, we can show that $E \sup_v |R_{3n}(v,x)| = O(a^r) = E \sup_v |R_{4n}(v,x)|$
and $R_{5n}(v,x) = O(\sqrt{na}\, a^{2r})$ a.s. uniformly in $v \le \tau_0$. The term $R_{2n}$ is
easily seen to form a canonical U-process of degree 2 over a Euclidean class
of functions with envelope $\tilde H_{2n}(W_i,W_j) = H_{2n}(W_i,W_j) + E_{\{1\}} H_{2n}(W_i,W_j) +
E_{\{2\}} H_{2n}(W_i,W_j) + E H_{2n}(W_i,W_j)$. Lemmas 6.9 and 5.7 imply $E \sup_v |R_{2n}(v,x)|
= O((na)^{-1/2})$, since $E H_{2n}(W_i,W_j) = O((na)^{-1})$ and $E[\tilde H_{2n}]^2(W_1,W_2)$ is of
the same order.
Next define



m





$\le 2\sqrt{na}\, O(a^{2r})\, \frac{1}{n}\sum_{i=1}^{n} g_{ni}(\tau_0, \beta_0, x) + O(1)(R_{7n;1} + R_{7n;2})$,
where $R_{7n;1} = \sqrt{na}\, a^{-3}\, U_{n,3}(h)$, $R_{7n;2} = \sqrt{na}\,(na^3)^{-1}\, U_{n,2}(\tilde h)$,

$$h(W_i,W_j,W_k) = \int_0^{\tau_0} [(f_{0nj} - E f_{0nj})(f_{0nk} - E f_{0nk})](u,\beta_0,x)\, g_{ni}(du,\beta_0,x)$$

and $\tilde h(W_i, W_j) = h(W_i, W_j, W_j)$. The first term is of order $O_p(\sqrt{na}\, a^{2r})$. We have
$E h(W_1,W_2,W_3) = 0$ and, using Lemmas 6.9 and 5.7, $E (na)^{1/2} a^{-3} |U_{n,3}(\pi_3 h)| =
O((na)^{-1})$ and $E (na)^{1/2} a^{-3} |U_{n,2}(\pi_2 E_{\{23\}} h)| = O((na)^{-1/2})$. The remaining
projections are 0. In the case of the term $R_{7n;2}$, we have $E R_{7n;2} = O((na)^{-1/2})$
and the expected $E|R_{7n;2}|$ is of the same order.

We now consider the term $R_{6n}$. For $\epsilon \in (0, 1)$, define

1 s(°) 1 

^n(e) = {t— < mm in f -?7vr( n '^' x ) - max SU P —(?))(. u ^^ x ) ^ i } • 

1 + e i «<^o ow « „< 1 — e 

/see D -j /sie -» 

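We have $P(\Omega_n(\epsilon)) \ge P(\sup_{u \le \tau_0} |S^{(0)}/s^{(0)} - 1|(u,\beta_0,x) \le \epsilon) \to 1$, by condition
B and Markov's inequality. On the event $\Omega_n(\epsilon)$, we also have
$\sup_{v \le \tau_0} |R_{6n}(v,x)| \le (1 - \epsilon)^{-1} R_{7n}$. Therefore $P(\sup_v |R_{6n}(v,x)| > \eta) \le
P(\Omega_n^c(\epsilon)) + P(\sup_{v \le \tau_0} |R_{6n}(v,x)| > \eta, \Omega_n(\epsilon)) \le P(\Omega_n^c(\epsilon)) + P(R_{7n} > (1-\epsilon)\eta) \to 0$
for any $\eta > 0$.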

Finally, suppose that $\hat\beta$ is a $\sqrt{n}$-consistent estimate of the parameter $\beta_0$. Then
$\sqrt{na}[\hat A(v, x, \hat\beta) - \hat A(v, x, \beta_0)] =$

vHS-A)]^ I" -^(u,P*,x)N i (du,x) , (6.1) 
Vo [Si/] 2 

where $\beta^*$ is between $\beta_0$ and $\hat\beta$. Let $I_n(\beta) = a^{-2} U_{n,2}(h_\beta)$, where $h_\beta(W_i,W_j) =
\int_0^{\tau_0} f_{1ni}(u,\beta,x)\, g_{nj}(du,\beta,x)$. It is easy to see that $E I_n(\beta) = O(1)$. By Lipschitz
continuity of the function $h_\beta$ with respect to $\beta$, $I_n(\beta)$ is a U-process of degree
2 over a Euclidean class of functions for envelope $H_{5n}(W_i, W_j)$. By Lemmas
6.9 and 5.7, $E \sup_{\beta \in B} |a^{-2} U_{n,2}(\pi_2 h_\beta)| = O([E H_{5n}(W_1,W_2)^2]^{1/2}) = O((na)^{-1})$,
$E \sup_{\beta \in B} |a^{-2} U_{n,1}(\pi_1 E_{\{1\}} h_\beta)| = O((na)^{-1/2})$ and $E \sup_{\beta \in B} |a^{-2} U_{n,1}(\pi_1 E_{\{2\}} h_\beta)| =
O((na)^{-1/2})$. Therefore $\sup_{\beta \in B} |I_n(\beta)| = O_p(1)$. Further, if $\hat\beta$ is a $\sqrt{n}$-consistent
estimate of $\beta_0$, then $\sqrt{n}[\hat\beta - \beta_0] = O_p(1)$. To show that the right-hand side
of (6.1) is of order $O_p(\sqrt{a})$, it is enough to note that for any $\epsilon \in (0,1)$, the
supremum $\sup\{|\int_0^v S^{(1)}[S^{(0)}]^{-2}(u, \beta, x)\, N_i(du,x)| : v \le \tau_0, \beta \in B\}$ is bounded by
$(1 - \epsilon)^{-2} \sup_{\beta \in B} I_n(\beta)$ on the event $\Omega_n(\epsilon)$.
Appendix D: Proof of Propositions 3.3 and 3.5

Define 

~ i n r 

<D 0ri (/3 ) = "^EE [Zi m (u)s { °j(u,p ,X im )-S { }l(u,po,X im )}M im (du) , 
v n i=i m ^ 

i n r 

- Z irn {u) 

n i=l m J 

I n r (!) 
$0n(A)) = ~/^EE / ^ Z ^ u ) ~ ^j(u, Po, X im )}M im (du) , 

* i=l m 

i n /* s ( 2 ) AwN® 2 

S 0n(/3) = -EE/[^)-b) ](n,/3,X im )AT im (^), 

i=l m V / 



i r° si - 1 si 1 -* 

$lra(A>) = -/=E / [^m(u)-^y(«,/?0,^m) («, A), ^Qm)] N im (du) , 

V m JO 

<J>2n(/3o) = "-^EE r^ m ^ 



s (l) 



S (0) _ a(0 ) 
X ^ — 7(0) )(u,Po,X im )N im (du) , 

1 " ^ r° - s (1) 5 (0)) - s (0) 
$ 3 „(/3o) = ^EE/ ( " (0) )(^-fo) )(u,Po,X tm )N im (dn) 

V n „-_i ™ JO sw sw 



j=l m 



*4n(/%) = - T £E — [(-?5r-l) 2i 7oT](«.A),^im)^m(d«), 

v n i=1 m JO « S"!/ 

1 n . ^(2) _ g (2) 

= -EE/[— 7(0) ^-^]KA^m)iV im (du), 

i=l m 

S 2n(/?) = ""EE /[ ">) ^(^.^l^W , 
i=l m 

S 3 „(/3) = - EE - ^<m)^im(d«) , 

^ i=l m 7 

where ^ = (S^ _ s -i) g, s (i)/[ a (o)]2 ) = " (siVM*?)® 2 and 

Under assumptions of Proposition 2.3, $\Sigma_n(\beta)$ is the negative derivative of the
score function $\Phi_n(\beta)$. Similarly, under assumptions of Proposition 2.5, we have
$\Phi_n(\beta_0) = \sum_{j=1}^{4} \Phi_{jn}(\beta_0)$ and $\Sigma_n(\beta) = \sum_{j=0}^{3} \Sigma_{nj}(\beta)$ is the negative derivative of
the score function $\Phi_n(\beta)$. The proof of both propositions amounts to application
of the following lemma and results of Bickel et al. ((1993), p. 517).

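Lemma 7.10 (i) Under assumptions of Proposition 2.3 we have $\tilde\Phi_{0n}(\beta_0) \Rightarrow
\mathcal{N}(0, \Sigma_2(\beta_0))$, $\Sigma_n(\beta_0) \to_P \Sigma_1(\beta_0)$, $\Phi_n(\beta_0) - \tilde\Phi_{0n}(\beta_0) \to_P 0$, and

$\sup\{|\Sigma_n(\beta) - \Sigma_n(\beta_0)| : |\beta - \beta_0| \le \epsilon_n\} \to_P 0$.

(ii) Under assumptions of Proposition 2.5 we have $\Phi_{0n}(\beta_0) \Rightarrow \mathcal{N}(0, \Sigma(\beta_0))$,
$\Sigma_{0n}(\beta_0) \to_P \Sigma(\beta_0)$, $\Phi_{1n}(\beta_0) - \Phi_{0n}(\beta_0) \to_P 0$, $\Phi_{kn}(\beta_0) \to_P 0$ for $k = 2,3,4$,
$\Sigma_{kn}(\beta_0) \to_P 0$ for $k = 1,2,3$, and $\sup\{|\Sigma_{kn}(\beta) - \Sigma_{kn}(\beta_0)| : |\beta - \beta_0| \le
\epsilon_n\} \to_P 0$ for $k = 0,1,2,3$.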

Proof. First note that under the assumed regularity conditions, asymptotic
normality of the terms $\tilde\Phi_{0n}(\beta_0)$ and $\Phi_{0n}(\beta_0)$ follows from the CLT.

We show that $\Phi_n(\beta_0) - \tilde\Phi_{0n}(\beta_0) \to_P 0$ and $\Phi_{1n}(\beta_0) - \Phi_{0n}(\beta_0) \to_P 0$. For
any bounded function $\varphi(u, x)$, let $G^{\varphi}_{ij} = G^{\varphi}(W_i, W_j)$ be given by

G % = J2 <p(u,X im )[Z im (u)sf\u,P,X im ) - sf(u,f3,X im )]N m (du) 

Under assumptions of Proposition 2.3, we have 3> n (/?o) = 3>on(A)) + Op(y/na) + 
^n^i^G^)) for (p = 1. Similarly, under assumptions of Proposition 2.5, we have 
*m(A>) = ^On(/3o) + Op{^ia 2 ) + U ni2 (7r 2 G^) for <^u,x) = [^(m,/*,,*)]" 1 . 
Thus it is enough to show that in both cases $E|U_{n,2}(\pi_2 G^{\varphi})| = O((na)^{-1/2})$.
Choose $\varphi = [s^{(0)}]^{-1}$ for instance, and define

G n (W i ,W j )) = aT 1 Y,Y, [ \ Z irn(u)\ p 7i- pj n(u,Po,X im )M im (du) 

+ E / E l^ m | P (^^m(^)e /3 ° Z ' m(U) 7l- P ,in(^, A), *imM«, X iro )d« 

/ [\Z im (u)\f nj (u,Po,X im ) + fi nj (u,Po,X im )]N im (du) . 
We have $E|U_{n,2}(\pi_2 G^{\varphi})| = O(n^{-1/2}(E G_n^2(W_1,W_2))^{1/2}) = O((na)^{-1/2})$ because,
by Lemma 2.1, the expectation $E G_n^2(W_1, W_2)$ is bounded by
4 f T " f T — 

~Z2 / a pA U > U > X ) E [fl-pjn( u >Po> x )] 2a ( U ' X ) dudx + 

° P =o ^ 

2 2 

/•TO /-TO /*T f _ f 

/ / / "p,p( u i^2,x)E[TT/ 1 ■ n (u J ,/3o,x)]TTa(itj,x)duidu 2 rfa; + 
Jo Jo Jo , =1 /=1 

/•TO /-TO fT fT _2 ^ _2 

/ / / / Pp; P (lt,x)E[n/ 1 _ pJn (^,/?o,^)]]T a{u h xi)duidxi 
Jo Jo Jo Jo iJ[ " 



Here in the last line $u = (u_1, u_2)$ and $x = (x_1, x_2)$. By Lemma 6.8, the bound is
of order $O(a^{-1})$. It follows now that $\Phi_{1n}(\beta_0) - \Phi_{0n}(\beta_0) \to_P 0$.

The same argument applied to the function $\varphi(u, x) = 1$ shows that $\Phi_n(\beta_0) -
\tilde\Phi_{0n}(\beta_0) \to_P 0$. Changing the risk processes $S^{(k)}$ to $S^{(k+1)}$, $k = 0,1$, in the definition
of $G^{\varphi}(W_i, W_j)$, we also obtain



E n (/%) - Si(A)) 

fTD 

- I \ 1 

n 

" 'o 



/•TO 

: V / Z im (u)s^(u,p ,X im ) - s^(u,(3 ,X im )M im (du) +o p (l) 

™ JO 



The Strong Law of Large Numbers implies that $\Sigma(\beta_0) \to_P \Sigma_1(\beta_0)$. Components
of the matrix $\Sigma_n(\beta)$ are Lipschitz continuous in $\beta$, and it is easy to verify that

$|\Sigma_n(\beta) - \Sigma_n(\beta')| \le |\beta - \beta'|\, U_{n,2}(G_{2n})$ where $G_{2n}$ is a kernel of degree 2 satisfying
$E|U_{n,2}(G_{2n})| = O(1)$. This completes the proof of the first part of the proposition.

Further, the terms $\Phi_{2n}(\beta_0)$ and $\Sigma_{1n}(\beta_0)$ are U-statistics of degree 2. Using
similar algebra as in the case of the difference $\Phi_{1n} - \Phi_{0n}$, we can show that they
converge to 0 in probability.

Next define 

i?ln = ^5Z5Z / r , (« ; ^im)JJ( — )(u,Po,X im )N im (du) 

71 i=i m ^° fc=l 

where $\psi(u,x)$ is a bounded function and $p, q = 0$ or $1$. We have $\sqrt{n} H_{1n} =
O_p(\sqrt{n} a^2) + O(1)[\sqrt{n} a^{-2} U_{n,3}(H) + \sqrt{n}(na^2)^{-1} U_{n,2}(\tilde H)]$, where

$$H(W_i, W_j, W_k) = \sum_m \int_0^{\tau_0} \psi(u, X_{im}) [f_{pjn} - E f_{pjn}][f_{qkn} - E f_{qkn}](u,\beta_0,X_{im})\, N_{im}(du)$$

and $\tilde H(W_i, W_j) = H(W_i, W_j, W_j)$. We have $E H(W_1, W_2, W_3) = 0$. Lemmas 5.7,
6.8 and 6.9 imply that $E \sqrt{n} a^{-2} |U_{n,3}(\pi_3 H)| = O((na)^{-1})$ and $O((na)^{-1/2}) =
E \sqrt{n}(na^2)^{-1} |U_{n,2}(\pi_2 E_{\{23\}} H)|$ while the remaining projections are 0. Further,
$\sqrt{n}(na^2)^{-1} E U_{n,2}(\tilde H) = O((na^2)^{-1/2}) = \sqrt{n}(na^2)^{-1} E U_{n,2}(|\tilde H|)$ so that the condition
$na^2 \uparrow \infty$ implies asymptotic negligibility of the third term of $\sqrt{n} H_{1n}$.

The choice of $\psi = 1$, $p = 1$, $q = 0$ implies that if $na^4 \downarrow 0$ and $na^2 \uparrow \infty$ then
$\Phi_{3n}(\beta_0) \to_P 0$. The choice of $\psi = 1$ and $p = q = 1$ implies $\Sigma_{2n}(\beta_0) \to_P 0$.
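
The two bandwidth conditions pull in opposite directions, so it can help to see a concrete rate that meets both. The sketch below assumes a polynomial bandwidth $a = n^{-\gamma}$, which is an illustrative choice and not one prescribed by the text; under that assumption $na^2 = n^{1-2\gamma}$ and $na^4 = n^{1-4\gamma}$, so both conditions hold exactly when $1/4 < \gamma < 1/2$. The function name `bandwidth_rates` is hypothetical.

```python
def bandwidth_rates(n, gamma):
    """Return (n * a^2, n * a^4) for the assumed bandwidth a = n ** (-gamma)."""
    a = n ** (-gamma)
    return n * a ** 2, n * a ** 4


def satisfies_conditions(gamma):
    """True when n*a^2 -> infinity and n*a^4 -> 0 for a = n ** (-gamma)."""
    return 0.25 < gamma < 0.5


# With gamma = 1/3 the first quantity grows like n^(1/3)
# and the second decays like n^(-1/3).
rates = [(n, *bandwidth_rates(n, 1 / 3)) for n in (10**3, 10**6, 10**9)]
for n, na2, na4 in rates:
    print(f"n={n:>10d}  n*a^2={na2:10.3f}  n*a^4={na4:.5f}")
```

A boundary choice such as $\gamma = 1/4$ keeps $na^4$ constant rather than sending it to zero, so it falls outside the admissible range.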

To handle the term $\Phi_{4n}(\beta_0)$ define

1 n /" T( » 7 S (0) - s (0) 

^ = - EE / ) 2 )(«,/%,x im )Ai m (d«) . 

Using $(x + y)^2 \le 2x^2 + 2y^2$, we have $\sqrt{n}|H_{2n}| \le 2 O_p(\sqrt{n} a^4) + 2\sqrt{n} H_{2n;1} +
2\sqrt{n} H_{2n;2}$, where $H_{2n;1}$ corresponds to the sum $H_{1n}$ applied with function $\psi =
E f_{1ni}/s^{(0)}$, and $H_{2n;2}$ is a V-statistic of degree 4: $H_{2n;2} = O(1)[a^{-3} U_{n,4}(h) +
(na^3)^{-1} U_{n,3}(\tilde h) + 2(na^3)^{-1} U_{n,3}(h') + (n^2 a^3)^{-1} U_{n,2}(h'')]$, where

/ i (^,W j ,W jfc ,^) = E r^ijn-^7i jn ] II [/o P n-E/ 0p „](n,/3o,X im )iV im (^) 

and $\tilde h(W_i,W_j,W_k) = h(W_i,W_j,W_k,W_k)$, $h'(W_i,W_j,W_k) = h(W_i,W_j,W_j,W_k)$,
$h''(W_i,W_j) = h(W_i,W_j,W_j,W_j)$. We have $E|\sqrt{n}(n^2 a^3)^{-1} U_{n,2}(h'')| \le
\sqrt{n}(n^2 a^3)^{-1} E U_{n,2}|h''|$ which is bounded by

-2~3 / -^fijnlifojn - E f Qjn ] 2 \(u, f3 , x)s(°\u, f3 , x)a(u, x)dudx 

n a Jji 

Under conditions D.2 (ii) this bound is of order $O(n^{-3/2} a^{-2})$ and tends to 0 if
$na^2 \uparrow \infty$. A similar argument shows also that the second and third term of
$\sqrt{n} H_{2n;2}$ have expectation tending to 0 when $na^2 \uparrow \infty$ and $na^4 \downarrow 0$. The first
term has expectation 0. By Lemmas 5.7 and 6.9, we have $E \sqrt{n} a^{-3} |U_{n,4}(\pi_4 h)| =
O((na)^{-3/2})$, $E \sqrt{n} a^{-3} |U_{n,3}(\pi_3 E_{\{234\}} h)| = O((na)^{-1})$, while the remaining
projections are 0.

Further, for $\epsilon \in (0, 1)$, define

1 1 
^n(e) = {t-— < min , inf -?7it( u '^' x ) - max SU P -77w( u '^' x ) - 1 J" 

/3es °-i ' P 1 B °-i 
As in the proof of Proposition 3.4, the condition C implies $P(\Omega_n(\epsilon)) \to 1$. Also
on the event $\Omega_n(\epsilon)$, the term $\Phi_{4n}(\beta_0)$ satisfies $|\Phi_{4n}(\beta_0)| \le (1 - \epsilon)^{-1} \sqrt{n} H_{2n}$. For
any $\eta > 0$, we have $P(|\Phi_{4n}(\beta_0)| > \eta) \le P(\sqrt{n} H_{2n} > \eta, \Omega_n(\epsilon)) + P(\Omega_n^c(\epsilon)) \le
P(\sqrt{n} H_{2n} > (1 - \epsilon)\eta) + P(\Omega_n^c(\epsilon)) \to 0$.

Application of the condition C shows also that $\Sigma_{3n}(\beta_0) \to_P 0$. Finally, it is
easy to verify that the matrices $\Sigma_{nk}$, $k = 0,1,2,3$, satisfy $|\Sigma_{nk}(\beta) - \Sigma_{nk}(\beta')| \le
|\beta - \beta'|\, O_P(1)$, which completes the proof of the lemma.
Table 3.1. Polynomial kernels of degree 2, 4 and 6

$\mu = 1$   interior   $(3/4)(1 - x^2)$
            left       $6(p + x)(p + q)^{-4}[p^2 - 2pq + 3q^2 + 2x(p - 2q)]$
            right      $6(q - x)(p + q)^{-4}[3p^2 - 2pq + q^2 + 2x(2p - q)]$

$\mu = 2$   interior   $(15/16)(1 - x^2)^2$
            left       $60(q - x)(p + x)^2(p + q)^{-6}[p^2 - 2pq + 2q^2 + (2p - 3q)x]$
            right      $60(q - x)^2(p + x)(p + q)^{-6}[2p^2 - 2pq + q^2 + (3p - 2q)x]$

$\mu = 3$   interior   $(35/32)(1 - x^2)^3$
            left       $140(q - x)^2(p + x)^3(p + q)^{-8}[3p^2 - 6pq + 5q^2 + 2(3p - 4q)x]$
            right      $140(q - x)^3(p + x)^2(p + q)^{-8}[5p^2 - 6pq + 3q^2 + 2(4p - 3q)x]$
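
The kernels of Table 3.1 can be checked numerically. The Python sketch below makes several assumptions that are not stated in the table itself: the boundary kernels are supported on $[-p, q]$, the left kernel for $\mu = 1$ uses the coefficient $2x(p - 2q)$, and the $\mu = 3$ boundary kernels carry the factor $(p + q)^{-8}$. These are the choices under which each kernel integrates to one and reduces to the interior kernel when $p = q = 1$. The function names are hypothetical.

```python
def interior(x, mu):
    """Interior kernel of degree 2*mu on [-1, 1]."""
    coef = {1: 3 / 4, 2: 15 / 16, 3: 35 / 32}[mu]
    return coef * (1 - x * x) ** mu


def left(x, p, q, mu):
    """Left boundary kernel, assumed supported on [-p, q]."""
    s = p + q
    if mu == 1:
        return 6 * (p + x) * (p**2 - 2*p*q + 3*q**2 + 2*x*(p - 2*q)) / s**4
    if mu == 2:
        return (60 * (q - x) * (p + x)**2
                * (p**2 - 2*p*q + 2*q**2 + (2*p - 3*q)*x) / s**6)
    if mu == 3:
        return (140 * (q - x)**2 * (p + x)**3
                * (3*p**2 - 6*p*q + 5*q**2 + 2*(3*p - 4*q)*x) / s**8)
    raise ValueError("mu must be 1, 2 or 3")


def right(x, p, q, mu):
    """Right boundary kernel, assumed supported on [-p, q]."""
    s = p + q
    if mu == 1:
        return 6 * (q - x) * (3*p**2 - 2*p*q + q**2 + 2*x*(2*p - q)) / s**4
    if mu == 2:
        return (60 * (q - x)**2 * (p + x)
                * (2*p**2 - 2*p*q + q**2 + (3*p - 2*q)*x) / s**6)
    if mu == 3:
        return (140 * (q - x)**3 * (p + x)**2
                * (5*p**2 - 6*p*q + 3*q**2 + 2*(4*p - 3*q)*x) / s**8)
    raise ValueError("mu must be 1, 2 or 3")


def integrate(f, lo, hi, steps=4000):
    """Midpoint rule; adequate for these low-degree polynomials."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


# Total mass of each boundary kernel for an asymmetric support.
p, q = 0.4, 1.0
masses = {
    (side.__name__, mu): integrate(lambda x: side(x, p, q, mu), -p, q)
    for side in (left, right)
    for mu in (1, 2, 3)
}
```

Under these assumptions the right kernel is the reflection of the left one, `right(x, p, q, mu) == left(-x, q, p, mu)`, which gives a quick consistency check on the coefficients.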
Table 4.1. Regression estimates and standard errors of direct transitions

               TX -> AGVHD     TX -> CGVHD     AGVHD -> CGVHD
sex-match      .08 (.05)       .12 (.05)
CSA            .46 (.08)       .18 (.12)
Trem           -.58 (.13)      -.42 (.15)      -.28 (.24)
MTX            .38 (.20)       -.40 (.31)
disease        .12 (.07)                       -.13 (.13)

               TX -> relapse   AGVHD -> relapse   CGVHD -> relapse
sex-match      -.11 (.10)                         .21 (.10)
CSA                            -.52 (.31)
Trem           .20 (.12)       -.75 (.59)         .40 (.31)
MTX            .32 (.23)       .67 (.56)
disease        .14 (.10)

               TX -> death     AGVHD -> death     CGVHD -> death
Trem           .23 (.15)       .57 (.20)          .48 (.25)
CSA            -.25 (.18)
MTX            -1.06 (0.58)                       -.82 (.71)
disease                        .21 (.13)
prior AGVHD                                       .75 (.16)


The covariates are binary 0-1 variables: Sex-match = 1 if the donor is a female
and the recipient is a male. Disease = 1 if the disease type is ALL; Prior AGVHD
= 1 if AGVHD occurs prior to CGVHD. The GVHD prophylactic treatments are
labeled as cyclosporin (CSA = 1), T cell removal (Trem = 1) and methotrexate
(MTX = 1).
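
Because the model has a Cox-type form, a coefficient $\beta$ for a binary covariate corresponds to a multiplicative effect $e^{\beta}$ on the transition intensity. The sketch below converts one entry of Table 4.1 into a relative risk with an approximate 95% interval. The Wald-type interval based on the normal approximation and the function name `relative_risk` are assumptions made here for illustration; the table reports only estimates and standard errors.

```python
from math import exp


def relative_risk(beta, se, z=1.96):
    """Return (exp(beta), lower, upper) using an assumed Wald-type interval."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# CSA effect on the TX -> AGVHD transition, taken from Table 4.1: .46 (.08).
rr, lower, upper = relative_risk(0.46, 0.08)
print(f"relative risk {rr:.2f}, approx. 95% interval ({lower:.2f}, {upper:.2f})")

# An estimate is nominally significant at the 5% level when |beta / se| > 1.96.
z_score = 0.46 / 0.08
```

Entries whose interval contains 1, for instance the MTX effect .38 (.20) on the same transition, are not distinguishable from no effect at this nominal level.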
Figure captions

Figure 4.1 Baseline cumulative hazards of transitions originating from the 
transplant state versus age. The labels of states are 1 - transplant (TX), 2 - 
AGVHD, 3 - CGVHD, 4 - relapse and 5 - death. 

Figure 4.2 Baseline cumulative hazards of transitions originating from the
AGVHD state versus age. The labels of states are 2 - AGVHD, 3 - CGVHD, 4 -
relapse and 5 - death.

Figure 4.3 Baseline cumulative hazards of transitions originating from the
CGVHD state versus age. The labels of states are 3 - CGVHD, 4 - relapse and
5 - death.